Design of optimal Elman Recurrent Neural Network based prediction approach for biofuel production

Renewable sources such as biofuels have gained significant attention as a means of meeting the rising demand for energy. Biofuels are useful in several domains of energy generation, such as electricity, power, and transportation. Owing to their environmental benefits, biofuels have also gained significant attention in the automotive fuel market. As the availability of biofuels becomes essential, effective models are required to handle and predict biofuel production in real time. Deep learning has become an important technique for modelling and optimizing bioprocesses. In this view, this study designs a new optimal Elman Recurrent Neural Network (OERNN) based prediction model for biofuel prediction, called OERNN-BPP. The OERNN-BPP technique pre-processes the raw data using empirical mode decomposition and a fine-to-coarse reconstruction model. The ERNN model is then applied to predict biofuel productivity. To improve the predictive performance of the ERNN model, hyperparameter optimization is carried out using the political optimizer (PO), which optimally selects ERNN hyperparameters such as the learning rate, batch size, momentum, and weight decay. A sizable number of simulations are run on the benchmark dataset, and the outcomes are examined from several angles. The simulation results demonstrate the advantage of the proposed model over more recent methods for estimating biofuel output.


Scientific Reports | (2023) 13:8565 | https://doi.org/10.1038/s41598-023-34764-x | www.nature.com/scientificreports/

... optimize and model bioprocesses. Over the previous years, artificial neural networks (ANNs) have been used in the research and development of nonlinear, multidimensional bioprocesses. They have proved effective for modelling emerging bioprocesses for which no prior information is available on the metabolic and kinetic flows occurring in cells and their surroundings 7 . Moreover, an ANN depends entirely on data and requires no prior knowledge of the phenomena that govern the process 8 . The appeal of ANNs as modelling tools derives from their distinctive information-processing characteristics, i.e., distributed high parallelism, nonlinearity, and tolerance of noise and error, and from their ability to learn and generalize. ANNs have attracted growing interest as a soft computing tool that is not limited to data analysis and processing but can also be employed to solve problems in nonlinear and multifaceted processes 9 .
In recent times, deep learning and machine learning methods have been widely used for handling, modelling and optimizing biodiesel production, consumption and environmental impact. They take into account the effects of process parameters on biofuel yield, since producing a desired product requires efficient use of experimental models. These methods provide a modelling approach that is independent of the nature of the process or its mathematical model and that can model the process with high performance 10 .
In this research, a new optimal Elman Recurrent Neural Network (OERNN) based prediction model for biofuel prediction is proposed, which provides better results than existing approaches. The OERNN-BPP technique involves empirical mode decomposition (EMD) based pre-processing and a fine-to-coarse (FTC) based reconstruction model. The ERNN model is then employed to predict biofuel productivity. To enhance the predictive performance of the ERNN model, hyperparameter optimization is carried out using the political optimizer (PO). A comprehensive experimental analysis is carried out on a benchmark dataset, and the results are examined in terms of diverse aspects.
Related works. This section reviews research studies that have focused on the production of biofuel. The literature and the relevant research areas are summarized as follows. Elmore et al. 11 modelled biofuel production based on spatial distribution using the Moderate Resolution Imaging Spectroradiometer (MODIS). Here, rice residue was used to produce the biofuel. The accuracy and flexibility of this approach were very high; on the other hand, designing the spatial model is complex. Chanthawong et al. 12 proposed two approaches, namely two-stage least squares and three-stage least squares, for biofuel production in the Thailand market. This approach was developed at minimum cost, with very little dynamic model construction. De et al. 13 developed an accurate biogas prediction approach that used the k-nearest neighbors (KNN) model for the effective production of biofuel. Its forecasting accuracy is very high, with improved facility performance, but a few complexity issues arise during implementation. Dehghani et al. 14 then demonstrated a forecasting model for biofuel production that considered seven biofuel technologies, namely gas turbine, Combined Heat and Power (CHP) turbine, bio-pyrolysis, cellulosic bioethanol, grain bioethanol, torrefaction and biodiesel. Its execution performance and accuracy were very high, but the research and development of this approach is not very efficient.
Radivojević et al. 12 introduced the Automated Recommendation Tool (ART), a tool that leverages ML and probabilistic approaches to guide synthetic biology in a systematic fashion, without requiring a full mechanistic understanding of the biological system. Using sampling-based optimization, ART provides a set of recommended strains for the next engineering cycle, along with probabilistic predictions of their production levels.
Elveny et al. 13 presented a novel Machine Learning (ML) technique based on the Extreme Learning Machine (ELM) for modelling this essential value. A real database comprising 483 actual data sets was compared with the output predicted by the ELM technique. In Cui et al. 14 , different ML techniques were evaluated for the first time to establish a prediction model relating biodiesel composition to the cold filter plugging point (CFPP). Decision tree (DT) based techniques were the most effective at predicting the CFPP of biodiesel.
Kumar et al. 15 aimed to develop a new Adaptive Integrated Optimization Network (AION) for attaining optimal biofuel production with maximal accuracy and minimal error rates. The AION approach includes four stages: pre-processing of data, reconstruction of components, individual prediction, and ensemble prediction. Javed et al. 16 developed a grey forecasting technique with an optimized model framework (data accumulation function and background value generation). The proposed technique, the Even form of Grey Forecasting model EGM (1,1,α,θ), is a generalization of the even form of the grey forecasting model, and its comparative efficiency turned out to be generally higher than that of the original technique.
Beeravalli et al. 17 explored a new way of classifying feedstocks using secondary data sources. To maximize the reliability of the techniques used, the study investigated over 20 parameters of 106 feedstocks. It established a rating scheme for Multi-Criteria Decision Analysis (MCDA) in which the weighting of each parameter depends on expert opinion or on statistical techniques, namely Principal Component Analysis (PCA). The output of the ranking method is then fed to Multivariate Regression (MVR) and a Multilayer Perceptron (MLP) to rank feedstocks for producing the highest-quality sustainable biofuels for a specific location.
Geng et al. 18 addressed the influence of randomly fluctuating data and the weak anti-interference capability of the Markov chain model by proposing a dynamic fuzzy grey-Markov prediction model for biofuel production forecasting. The aim was to improve on conventional prediction methods by combining past production levels with economic, governmental policy, and technological development factors. Their empirical results demonstrated the superiority of the proposed fuzzy grey-Markov model over the benchmark prediction models. However, biofuel production is a complex system that is easily affected by various factors such as the economy, governmental policies, resources, and technological developments 21 . Therefore, traditional methods are not suitable for predicting biofuel production (Geng et al. 18 ).
The proposed model. In this study, a new OERNN-BPP technique is presented to predict the productivity of biofuels. The OERNN-BPP technique follows four major processes, namely EMD based processing, FTC based reconstruction, ERNN based prediction, and PO based hyperparameter optimization. The PO based hyperparameter tuning process optimally adjusts the learning rate, batch size, momentum, and weight decay. Figure 1 shows the overall block diagram of the OERNN-BPP model. These processes are discussed in detail in the succeeding sections.
EMD based data pre-processing. Initially, a decomposition method called EMD is used to separate the raw complex data into relatively simple components, thereby reducing the complexity of the problem. The EMD method is related to other decomposition methods such as Fourier decomposition and wavelet decomposition 22 . EMD is an intuitive, self-adapting, empirical and direct method that is well suited to nonlinear and non-stationary data. In general, EMD decomposes the raw time series into a number of intrinsic mode functions (IMFs) that carry independent information. Specifically, an IMF fulfils two criteria: (i) over the whole series, the number of extrema and the number of zero crossings are equal or differ by at most one; and (ii) at every point, the mean of the envelopes defined by the local maxima and the local minima is zero.

FTC based reconstruction. The FTC model uses t-testing to reconstruct the IMFs into two parts, namely low-frequency and high-frequency components. Each frequency component has distinct features that carry information about the central characteristics of the series. A simple FTC structure is therefore applied to improve the accuracy and reduce the computational complexity. The FTC model is a two-stage process. First, the pre-processed IMFs obtained from the previous stage are examined using t-testing. Next, IMFs that are similar, with insignificant differences at a given confidence level, are reconstructed into separate components. If no such similarity can be established at the given confidence level, the IMF is assigned to the high-frequency or low-frequency component together with its closest IMF according to the t-test.
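The t-test based split into high- and low-frequency components can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: it uses one common fine-to-coarse rule (accumulate the IMFs from finest to coarsest and find the first partial sum whose mean departs significantly from zero), the function name `fine_to_coarse_split` is hypothetical, and the critical value `t_crit=1.96` is an assumed approximation of a two-sided 5% test for long series. The IMFs are taken as already computed by an EMD routine.

```python
import math
from statistics import mean, stdev


def fine_to_coarse_split(imfs, t_crit=1.96):
    """Split IMFs (ordered finest to coarsest) into two reconstructed parts.

    Returns (boundary, high, low): IMFs before `boundary` are summed into the
    high-frequency part, the rest into the low-frequency part.
    t_crit is an ASSUMED critical value, not taken from the paper.
    """
    n = len(imfs[0])
    partial = [0.0] * n              # running sum s_i = c_1 + ... + c_i
    boundary = len(imfs)             # default: every IMF is high-frequency
    for i, imf in enumerate(imfs):
        partial = [p + c for p, c in zip(partial, imf)]
        m, sd = mean(partial), stdev(partial)
        # One-sample t statistic for the hypothesis mean(s_i) == 0.
        if sd == 0:
            t_stat = 0.0 if m == 0 else float("inf")
        else:
            t_stat = m / (sd / math.sqrt(n))
        if abs(t_stat) > t_crit:
            boundary = i             # first partial sum with a non-zero mean
            break
    high = [sum(col) for col in zip(*imfs[:boundary])] or [0.0] * n
    low = [sum(col) for col in zip(*imfs[boundary:])] or [0.0] * n
    return boundary, high, low
```

With three synthetic components (a fast oscillation, a slower oscillation, and a trend), the split would be expected to place the two oscillations in the high-frequency part and the trend in the low-frequency part.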
Design of ERNN based predictive model. At this stage, the ERNN model receives the input and predicts the actual production of biofuels. The ERNN is a simple RNN introduced by Elman in 1990 23 . As is well known, a recurrent network has several advantages, such as time-series and nonlinear prediction capabilities, faster convergence, and more accurate mapping ability. Previous studies have combined the Elman neural network (ENN) with different application areas to suit their purposes. In this network, the outputs of the hidden layer (HL) are allowed to feed back onto itself through a buffer layer called the recurrent layer (RL). Figure 2 illustrates the architecture of the ERNN model. This feedback enables the ERNN to learn, recognise, and produce spatial as well as temporal patterns. Every hidden neuron is connected to exactly one RL neuron with a constant weight of one. As a result, the RL effectively holds a copy of the HL state at the previous time instant. Accordingly, the number of recurrent neurons equals the number of hidden neurons. Each layer has several neurons that propagate information from one layer to the next by computing a nonlinear function of the weighted sum of their inputs.
A multi-input ERNN is considered, in which the input layer has m neurons, the HL has n neurons, and there is one output unit. Let x_it (i = 1, 2, ..., m) denote the input vector of neurons at time t, y_{t+1} the output of the network at time t + 1, and z_jt (j = 1, 2, ..., n) the output of the HL neurons at time t; the RL likewise has one neuron for each j = 1, 2, ..., n. w_ij is the weight that links node i of the input layer to node j of the HL, and c_j is the weight that links node j of the HL to the RL; a further weight links node j of the HL to the output. In the HL stage, the input of each hidden neuron is the weighted sum of the network inputs plus the weighted contribution of its RL neuron. The output of each hidden neuron is obtained by applying the sigmoid activation function f_H(x) = 1/(1 + e^{-x}) to this input. The network output is the weighted sum of the hidden outputs passed through the output activation function, which is chosen as the identity map 24 .
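The forward pass described above can be sketched in a few lines. This is an illustrative sketch in the standard Elman form, not the authors' implementation: the names `r` (recurrent-layer state) and `v` (hidden-to-output weights) are assumptions introduced here, the recurrent state is assumed to start at zero, and bias terms are omitted.

```python
import math


def sigmoid(x):
    """Hidden-layer activation f_H(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def elman_forward(inputs, w, c, v):
    """Run an Elman network over a sequence; return one output per time step.

    inputs: list of input vectors, each of length m
    w:      m x n input-to-hidden weights, w[i][j]
    c:      n recurrent weights, one per hidden neuron
    v:      n hidden-to-output weights (name ASSUMED for illustration)
    """
    n = len(c)
    r = [0.0] * n                      # recurrent layer, assumed to start at zero
    outputs = []
    for x in inputs:
        # Hidden input: weighted sum of inputs plus the recurrent contribution.
        net = [sum(w[i][j] * x[i] for i in range(len(x))) + c[j] * r[j]
               for j in range(n)]
        z = [sigmoid(a) for a in net]  # hidden-layer outputs
        # Output unit with identity activation.
        outputs.append(sum(v[j] * z[j] for j in range(n)))
        r = z[:]                       # RL copies the hidden state with weight one
    return outputs
```

Because the recurrent layer holds the previous hidden state, feeding the same input twice generally gives two different outputs, which is what lets the network model temporal patterns.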
Design of PO based hyperparameter tuning process. To enhance the predictive outcome of the ERNN model, hyperparameter tuning is carried out using PO. PO is a recently developed metaheuristic that is based on human behaviour and is inspired by the multi-phase process of politics. It should be mentioned that PO is not the first algorithm of this kind. However, PO maps the concepts of politics from a distinct point of view and differs from existing politics-inspired algorithms for four reasons. First, PO tries to model every major step in politics, such as party formation, party ticket or constituency allocation, party switching, election campaign, inter-party election, and parliamentary affairs after government formation. Second, PO introduces a new position updating strategy called the recent past-based position updating strategy (RPPUS), which represents what politicians learn from the previous election. Third, every individual solution takes on a dual role as a party member and an election candidate. With this concept, every solution can be updated with respect to two better solutions: the party leader and the constituency winner. Finally, to improve the result, the intermediate solutions communicate and cooperate through a phase called parliamentary affairs. In PO, every party member is regarded as a candidate solution whose goodwill is its position in the search space. Furthermore, the objective function is evaluated during the election stage, where the number of votes obtained by each party member represents the fitness of the candidate solution. PO consists of five major stages: party formation and constituency allocation, election campaign, party switching, inter-party election, and parliamentary affairs 25 .

Note that the initial stage (party formation and constituency allocation) is performed only once, for initialization, and sets distinct parameters.
Party formation and constituency allocation. At first, the population P is divided into N parties, where each party P_i contains N members (possible solutions). The jth member of party P_i is denoted P_i^j and is a d-dimensional vector, where d is the number of input variables of the problem at hand and P_{i,k}^j denotes the kth dimension of P_i^j. As mentioned above, every member acts as an election candidate as well as a party member. Therefore, N constituencies are formed, and the jth members of all parties contest the jth constituency. Moreover, the leader of the ith party, determined after calculating the fitness of each member, is denoted P*_i, and the set of all party leaders is denoted P*. Likewise, after the election, C* collects the winners of all constituencies, called parliamentarians, where C*_j denotes the winner of the jth constituency.
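The bookkeeping of parties, constituencies, leaders and winners can be sketched as below. This is a minimal illustration, not the authors' code: the function names are hypothetical, positions are drawn uniformly within assumed bounds `lower` and `upper`, and lower fitness is treated as better (for example, a prediction error), which is an assumption.

```python
import random


def init_parties(n, d, lower, upper, rng):
    """Create n parties of n members; each member is a d-dimensional vector."""
    return [[[rng.uniform(lower, upper) for _ in range(d)] for _ in range(n)]
            for _ in range(n)]


def leaders_and_winners(parties, fitness):
    """Return (leaders, winners) as index lists.

    leaders[i] is the member index of the fittest member of party i.
    winners[j] is the party index whose jth member wins constituency j.
    Lower fitness is treated as better (an ASSUMPTION for this sketch).
    """
    n = len(parties)
    scores = [[fitness(member) for member in party] for party in parties]
    leaders = [min(range(n), key=lambda j: scores[i][j]) for i in range(n)]
    winners = [min(range(n), key=lambda i: scores[i][j]) for j in range(n)]
    return leaders, winners
```

In a hyperparameter-tuning setting, each member would encode one candidate setting (for instance learning rate, batch size, momentum and weight decay), and `fitness` would train and score the ERNN for that setting.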
Election campaign. In this stage, party members try to improve their chance of being elected by altering their positions based on three factors. First, they attempt to learn from prior experience through the new position updating strategy, RPPUS, as expressed in Eqs. (6) and (7). Next, every party member updates its present position with respect to the party leader. Lastly, the candidate position is updated with respect to the constituency winner.

Party switching. To balance exploration and exploitation, a stage called party switching is carried out after the election campaign stage. With an adaptive parameter called the party switching rate, each party member P_i^j may be selected and switched to a randomly selected party P_r, where it is exchanged with the least fit member of P_r.
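The party switching step can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `party_switching` is hypothetical, the switching rate is passed in as a fixed value `rate` (how it adapts over iterations is not modelled here), and lower fitness is treated as better, so the least fit member is the one with the largest fitness value.

```python
import random


def party_switching(parties, fitness, rate, rng):
    """Swap members between parties with probability `rate` (in place).

    For each selected member, a random party r is chosen and the member is
    exchanged with the least fit member of r. Lower fitness is treated as
    better (an ASSUMPTION for this sketch).
    """
    n = len(parties)
    for i in range(n):
        for j in range(len(parties[i])):
            if rng.random() < rate:
                r = rng.randrange(n)                 # randomly selected party
                q = max(range(len(parties[r])),      # its least fit member
                        key=lambda k: fitness(parties[r][k]))
                parties[i][j], parties[r][q] = parties[r][q], parties[i][j]
    return parties
```

The swap only moves solutions between parties, so the population as a whole is unchanged; what changes is which leader each solution follows in the next campaign.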
Election. The aim of this stage is to calculate the fitness of each candidate contesting a constituency. Afterward, the party leaders and constituency winners are updated.

Parliamentary affairs. After the party leaders and constituency winners (parliamentarians) are determined, the parliamentarians aim to enhance their performance by mimicking the cooperation and interaction of winning candidates who run the government in the post-election stage. Each parliamentarian C*_j updates its position with respect to a randomly selected parliamentarian C*_r; the move is applied only when it improves performance.

Figure 3 depicts the flowchart of PO. Initially, the input parameters are initialized. The fitness function is then calculated for all individuals. For these individuals, the party leaders and constituency winners are determined, and positions are updated through the election campaign. Afterward, through parliamentary affairs, the parliamentarians aim to enhance their performance by mimicking cooperation. In the election stage, the fitness of each candidate contesting a constituency is calculated and the previous positions and fitness values are updated, which results in improved performance.
Experimental validation. This section analyses the results of the OERNN-BPP technique using annual biofuel production data gathered for China between January 2015 and June 2020 (https://apps.fas.usda.gov/newgainapi/api/Report/DownloadReportByFileName?fileName=Biofuels%20Annual_Beijing_China%20-%20People%27s%20Republic%20of_CH2022-0089.pdf). The samples in the dataset are split into training data (80%) and testing data (20%). The outcomes of the OERNN-BPP technique are analysed along several dimensions.

Mean absolute percentage error (MAPE). MAPE is a metric of the accuracy of a forecasting method. It is the average of the absolute percentage errors over all entries in a dataset and indicates how accurate the forecast quantities were in comparison with the actual quantities.

Table 1 shows the parameter settings of the OERNN-BPP model. First, the biofuel production rate predicted by the OERNN-BPP model is investigated over a period of 6 years (2015-2020) in Fig. 4 and Table 2. The results show the original data, the data predicted by the OERNN-BPP technique, and the divergence rate. The values obtained show that the OERNN-BPP technique predicted the biofuel production rate effectively and that the divergence rate is minimal. Moreover, the divergence rate increases with duration.
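The MAPE metric defined above, together with the RMSE used in the comparative analysis, can be computed as in the following sketch. The function names are illustrative, the formulas are the standard definitions, and actual values are assumed to be non-zero for MAPE.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error, in the units of the data."""
    return math.sqrt(sum((a - p) ** 2
                         for a, p in zip(actual, predicted)) / len(actual))
```

For example, with made-up actual values of 100 and 200 and predictions of 110 and 180 (not data from the study), each forecast is off by 10%, so `mape` returns 10.0. Lower values of both metrics indicate better forecasts.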
Another analysis of the OERNN-BPP technique, in terms of biofuel production cost over a certain duration, is shown in Fig. 5. The figure shows that the OERNN-BPP technique performs effectively, with only a slight difference between the actual and predicted data. The GWO based LSTM-RNN model also achieves reasonable outcomes. However, the OERNN-BPP technique outperforms the existing one with a higher predictive outcome. A comparative RMSE analysis is given in Table 3 and Fig. 6. Lower RMSE values indicate better prediction outcomes. The figure shows that the AWNN technique is the poorest performer, with the highest RMSE values. The ARIMA model attains a slightly better RMSE than the AWNN technique, whereas the GWO based LSTM-RNN technique shows a moderate RMSE. The AION technique exhibits a reasonably reduced RMSE. However, the proposed OERNN-BPP technique is the most efficient, with the minimum RMSE values under varying time durations.
A detailed comparative MAPE analysis of the OERNN-BPP approach under varying time durations is given in Table 4 and Fig. 7. Finally, a CT analysis of the OERNN-BPP technique against existing techniques is given in Table 5 and Fig. 8. The experimental results show that the GFMP technique offers the worst outcome, with the lowest CT of 1853.391. The EMD-LSTM-ELM and EMD-(GWO-LSTM-RNN + AWNN) techniques show moderate outcomes, with CTs of 2101.430 and 2356.358, respectively. However, the proposed OERNN-BPP technique yields the superior outcome, with the maximum CT of 3145.152. The detailed result analysis confirms that the OERNN-BPP technique is an effective tool for predicting biofuel productivity.

Conclusion
In this study, a new OERNN-BPP technique has been presented to predict the productivity of biofuels. The OERNN-BPP technique encompasses four major processes, namely EMD based processing, FTC based reconstruction, ERNN based prediction, and PO based hyperparameter optimization. Once the input data

Data availability
The datasets used and analyzed during the current study are available from the corresponding author on request.